首页> 外文OA文献 >Guaranteed clustering and biclustering via semidefinite programming
【2h】

Guaranteed clustering and biclustering via semidefinite programming

机译:通过半定编程保证聚类和双聚类

代理获取
本网站仅为用户提供外文OA文献查询和代理获取服务,本网站没有原文。下单后我们将采用程序或人工为您竭诚获取高质量的原文,但由于OA文献来源多样且变更频繁,仍可能出现获取不到、文献不完整或与标题不符等情况,如果获取不到我们将提供退款服务。请知悉。

摘要

Identifying clusters of similar objects in data plays a significant role in a wide range of applications. As a model problem for clustering, we consider the densest k -disjoint-clique problem, whose goal is to identify the collection of k disjoint cliques of a given weighted complete graph maximizing the sum of the densities of the complete subgraphs induced by these cliques. In this paper, we establish conditions ensuring exact recovery of the densest k cliques of a given graph from the optimal solution of a particular semidefinite program. In particular, the semidefinite relaxation is exact for input graphs corresponding to data consisting of k large, distinct clusters and a smaller number of outliers. This approach also yields a semidefinite relaxation with similar recovery guarantees for the biclustering problem. Given a set of objects and a set of features exhibited by these objects, biclustering seeks to simultaneously group the objects and features according to their expression levels. This problem may be posed as that of partitioning the nodes of a weighted bipartite complete graph such that the sum of the densities of the resulting bipartite complete subgraphs is maximized. As in our analysis of the densest k -disjoint-clique problem, we show that the correct partition of the objects and features can be recovered from the optimal solution of a semidefinite program in the case that the given data consists of several disjoint sets of objects exhibiting similar features. Empirical evidence from numerical experiments supporting these theoretical guarantees is also provided.
机译:识别数据中相似对象的簇在广泛的应用中起着重要的作用。作为聚类的模型问题,我们考虑最密集的k-不相交-clique问题,其目标是识别给定加权完整图的k个不相交的集团,以最大程度地提高由这些集团诱导的完整子图的密度之和。在本文中,我们建立了确保从特定半确定程序的最佳解中准确恢复给定图的最密集k团的条件。特别是,半确定松弛对于与包含k个巨大的不同簇和较少数量的异常值的数据相对应的输入图而言是精确的。这种方法还产生了半确定的松弛,并为双簇问题提供了类似的恢复保证。给定一组对象和这些对象展示的一组特征,双聚类法试图根据对象和特征的表达水平同时对它们进行分组。可能会出现这样的问题,即对加权二部完全图的节点进行划分,以使生成的二部完全子图的密度之和最大。正如我们对最密集的k-不相交-clique问题的分析一样,我们表明,在给定数据由几个不相交的对象集组成的情况下,可以从半定程序的最优解中恢复对象和特征的正确分区表现出相似的特征。还提供了来自支持这些理论保证的数值实验的经验证据。

著录项

  • 作者

    Ames, Brendan P. W.;

  • 作者单位
  • 年度 2014
  • 总页数
  • 原文格式 PDF
  • 正文语种
  • 中图分类

相似文献

  • 外文文献
  • 中文文献
  • 专利
代理获取

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号